Recovering non-local dependencies for Chinese
To date, work on Non-Local Dependencies (NLDs) has focused almost exclusively on English, and it is an open research question how well these approaches migrate to other languages. This paper surveys non-local dependency constructions in Chinese as represented in the Penn Chinese Treebank (CTB) and provides an approach for generating proper predicate-argument-modifier structures including NLDs from surface context-free phrase structure trees. Our approach recovers non-local dependencies at the level of Lexical-Functional Grammar f-structures, using automatically acquired subcategorisation frames and f-structure paths linking antecedents and traces in NLDs. Currently our algorithm achieves 92.2% f-score for trace insertion and 84.3% for antecedent recovery when evaluated on gold-standard CTB trees, and 64.7% and 54.7%, respectively, on trees output by a CTB-trained state-of-the-art parser.
Treebank-based acquisition of Chinese LFG resources for parsing and generation
This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing
and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena and (in cooperation with PARC) develop a gold-standard dependency-bank of Chinese f-structures for evaluation. Based on the Penn Chinese Treebank, I design and implement two architectures for inducing Chinese LFG resources, one annotation-based and the other dependency conversion-based. I then apply the f-structure acquisition algorithm together with external, state-of-the-art parsers to parsing new text into "proto" f-structures. In order to convert "proto" f-structures into "proper" f-structures or deep dependencies, I present a novel Non-Local Dependency (NLD) recovery algorithm using subcategorisation frames and f-structure paths linking antecedents and traces in NLDs extracted from the automatically-built LFG f-structure treebank. Based on the grammars extracted from the f-structure annotated treebank, I develop a PCFG-based chart generator and a new n-gram based pure dependency generator to realise Chinese sentences from LFG f-structures.
The work reported in this thesis is the first effort to scale treebank-based, probabilistic Chinese LFG resources from proof-of-concept research to unrestricted, real text. Although this thesis concentrates on Chinese and LFG, many of the methodologies, e.g. the acquisition of predicate-argument structures, NLD resolution and the PCFG- and dependency n-gram-based generation models, are largely language- and formalism-independent and should generalise to diverse languages as well as to labelled bilexical dependency representations other than LFG.
Treebank-based acquisition of LFG resources for Chinese
This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of Burke et al. (2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially extend and improve on this earlier research as regards coverage, robustness, quality and fine-grainedness of the resulting LFG resources. We achieve this through (i) improved LFG analyses for a number of core Chinese phenomena; (ii) a new automatic f-structure annotation architecture which involves an intermediate dependency representation; (iii) scaling the approach from 4.1K trees in CTB2 to 18.8K trees in CTB version 5.1 (CTB5.1) and (iv) developing a novel treebank-based approach to recovering non-local dependencies (NLDs) for Chinese parser output. Against a new 200-sentence gold standard of manually constructed f-structures, the method achieves 96.00% f-score for f-structures automatically generated for the original CTB trees and 80.01% for NLD-recovered f-structures generated for the trees output by Bikel's parser.
An analysis of question processing of English and Chinese for the NTCIR 5 cross-language question answering task
An important element in question answering systems is the analysis and interpretation of questions. Using the NTCIR 5 Cross-Language Question Answering (CLQA) question test set we demonstrate that the accuracy of deep question analysis is dependent on the quantity and suitability of the available linguistic resources.
We further demonstrate that applying question analysis tools developed on monolingual training materials to questions machine-translated from Chinese to English and from English to Chinese greatly reduces the effectiveness of question interpretation. This latter result indicates that question analysis for CLQA should primarily be conducted in the question language prior to translation.
ACTS in Need: Automatic Configuration Tuning with Scalability Guarantees
To support the variety of Big Data use cases, many Big Data related systems
expose a large number of user-specifiable configuration parameters. As our
experiments highlight, a MySQL deployment with well-tuned configuration
parameters achieves a peak throughput 12 times that of one with the default
setting. However, finding the best setting for the tens or hundreds of
configuration parameters is all but impossible for ordinary users. Worse still, many Big Data
applications require the support of multiple systems co-deployed in the same
cluster. As these co-deployed systems can interact to affect the overall
performance, they must be tuned together. Automatic configuration tuning with
scalability guarantees (ACTS) is needed to help system users. Solutions to
ACTS must scale to various systems, workloads, deployments, parameters and
resource limits. By proposing and implementing an ACTS solution, we demonstrate
that ACTS can benefit users not only by improving system performance and
resource utilization, but also by saving costs and enabling fairer
benchmarking.
Poly(ethylene glycol)-conjugated surfactants promote or inhibit aggregation of phospholipids
The calcium-induced aggregation of dilauroyl phosphatidic acid (DLPA) suspensions, with or without added poly(ethylene oxide) (PEO)-conjugated surfactants containing 4 to 30 ethylene oxide subunits, was monitored by turbidity measurement and quasi-elastic light scattering (QLS). The aggregation was inhibited (protected) by the incorporated PEO surfactant for most samples, while a window of promotive effect was found for samples with low surface coverage by the PEO moiety of the incorporated surfactant. Promotion occurs only when the aggregation is slow and at a low level. The promotion is explained by the synergistic effect of PEO and divalent calcium cations when the steric repulsion is weak. The promotion/protection crossover results from the interplay between the PEO/calcium synergistic effect and the steric repulsion.
Live and let die: asymmetric dimethylarginine and septic shock
Nitric oxide (NO) is an important mediator of host defence and of vascular tone. In septic shock, upregulation of inducible NO synthase leads to the production of vast amounts of NO, which contribute to pathogen elimination but also to inappropriate vasodilation and to loss of vascular resistance. Asymmetric dimethylarginine (ADMA) is an endogenous inhibitor of NO synthases shown to contribute to the regulation of vascular tone. ADMA was recently identified as a marker of organ dysfunction and mortality in intensive care patients and as a novel cardiovascular risk factor. In the present issue of Critical Care, a study by O'Dwyer and colleagues identifies ADMA as a potential regulator of NO production in septic shock. Being an inhibitor of NO production, ADMA may at least partly counteract pathological hypotension, but at the same time may impair the NO-dependent host defence. A mechanism is proposed by which the interplay between ADMA and inducible NO synthase activity is mediated. ADMA levels should be determined in future studies evaluating the regulation of NO in the intensive care setting.
BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning
An ever-increasing number of configuration parameters are provided to system
users. But many users use one configuration setting across different
workloads, leaving untapped the performance potential of systems. A good
configuration setting can greatly improve the performance of a deployed system
under certain workloads. But with tens or hundreds of parameters, it becomes a
highly costly task to decide which configuration setting leads to the best
performance. While such a task requires strong expertise in both the system
and the application, users commonly lack such expertise.
To help users tap the performance potential of systems, we present
BestConfig, a system for automatically finding a best configuration setting
within a resource limit for a deployed system under a given application
workload. BestConfig is designed with an extensible architecture to automate
the configuration tuning for general systems. To tune system configurations
within a resource limit, we propose the divide-and-diverge sampling method and
the recursive bound-and-search algorithm. BestConfig can improve the throughput
of Tomcat by 75%, that of Cassandra by 63%, that of MySQL by 430%, and reduce
the running time of a Hive join job by about 50% and that of a Spark join job by
about 80%, solely by configuration adjustment.